feat: competitive review pipeline improvements#33
Conversation
…sing Phase 1 improvements to the review pipeline, informed by competitive analysis of Greptile, CodeRabbit, and Qodo Merge architectures: - File triage (#29): Heuristic classifier skips lock files, generated files, deletion-only, whitespace-only, and comment-only changes before expensive LLM review. 45 unit tests. - Dynamic context boundaries (#25): Context fetcher now searches upward for enclosing function/class definitions instead of using fixed-window context. Uses asymmetric padding (more before, less after) matching Qodo's approach. 6 unit tests. - Robust LLM output parsing (#28): 5-strategy fallback chain for parsing LLM responses — primary format, numbered lists, markdown bullets, file:line format, and JSON extraction. Prevents silent comment loss when models deviate from the expected format. 11 unit tests. - Fix rust-embed v8 derive macro (Embed -> RustEmbed) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Verification pass (#23): LLM validates review findings against actual code, scoring 0-10 with auto-zero for noise (docstrings, type hints, imports). Filters hallucinated comments before output. 17 new tests. - Multi-model routing (#26): ModelRole enum (Primary/Weak/Reasoning/Embedding) with per-role model config and fallback chain. 12 new tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
4-stage progressive degradation strategy for diffs exceeding token budget: Stage 1 (Full): all diffs fit → use as-is Stage 2 (Compressed): remove deletion-only hunks, drop tail files Stage 3 (Clipped): truncate large files at hunk boundaries Stage 4 (MultiCall): split into multiple batches (up to 5) Integrated into review pipeline — files dropped by compression are skipped with logging. 24 new tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Prevent compression analysis from skipping files in per-file review mode, make verification diff truncation UTF-8 safe, and tighten comment-only triage to avoid dereference false positives while improving compression packing order. Made-with: Cursor
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of failing the workflow, emit a notice and skip the review and comment-posting steps when OPENAI_API_KEY is not set. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ding default - `#` comment prefix: exclude Rust attributes (#[...], #![...]), C preprocessor directives (#include, #define, etc.) from comment detection - `generated/` match: use boundary-aware checks (starts_with or /generated/) to avoid false positives on paths like `user_generated/` - Embedding model default: fall back to primary model instead of hardcoding OpenAI's text-embedding-3-small, respecting self-hosted provider support - 4 new tests for the triage fixes Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…errors Use the Weak model role for the verification pass — it's a validation task that doesn't need the frontier model. This also resolves the clippy dead_code errors for ModelRole, model_for_role, and to_model_config_for_role by giving them their first real callsite. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| // 3. Deletion-only — all non-context lines are Removed | ||
| if all_changes | ||
| .iter() | ||
| .all(|c| matches!(c.change_type, ChangeType::Removed)) | ||
| { | ||
| return TriageResult::SkipDeletionOnly; | ||
| } |
There was a problem hiding this comment.
🔴 Triage SkipDeletionOnly silently skips review of security-relevant code removals
The new triage system classifies any diff where all non-context changes are Removed as SkipDeletionOnly, causing these files to be silently skipped from LLM review (src/review/pipeline.rs:311-320). This means removal of security checks, validation logic, error handling, authentication guards, or rate-limiting code would never be reviewed.
Example of dangerous deletion-only diff that would be skipped
If a developer removes an authentication check:
- if !user.is_authorized() {
- return Err("Unauthorized");
- }This is a deletion-only change. The triage at triage.rs:90-94 classifies it as SkipDeletionOnly, and the pipeline at pipeline.rs:312-320 skips the file entirely — no LLM review occurs for this security regression.
Deletion-only changes are a well-known category of security-relevant refactors. Unlike the pre-existing diff.is_deleted check (which skips fully deleted files — a reasonable heuristic), SkipDeletionOnly applies to files that still exist but had code removed, which is exactly the pattern for accidentally removing safeguards.
Prompt for agents
In src/review/triage.rs, remove the SkipDeletionOnly classification entirely, or make it configurable. The deletion-only check at lines 89-95 in triage_diff() should be removed because deletion-only changes can represent security-relevant code removals (e.g., removing authentication checks, validation, error handling). If kept at all, it should be gated behind a config flag like triage_skip_deletions: bool (default false). Also update the TriageResult enum, should_skip(), and reason() to remove or gate the variant. In src/review/pipeline.rs around line 311-320, the triage skip logic is fine as-is — it just needs the underlying triage heuristic to be less aggressive.
Was this helpful? React with 👍 or 👎 to provide feedback.
| ]; | ||
|
|
||
| /// Comment line prefixes (after trimming leading whitespace). | ||
| const COMMENT_PREFIXES: &[&str] = &["//", "# ", "/*", "*/", "* ", "--", "<!--", "\"\"\"", "'''"]; |
There was a problem hiding this comment.
🟡 COMMENT_PREFIXES includes "* " which false-positives on Markdown list items and other non-comment content
The is_comment_line function at src/review/triage.rs:176-193 uses COMMENT_PREFIXES to detect comment-only changes, but it's file-type-agnostic. The prefix "* " (intended for C/Java/Rust doc-comment continuation lines like * @param x) also matches Markdown list items (* Fix the build process). A diff that only modifies Markdown list items, horizontal rules (--), or headings (# ) in .md/.sql/other files would be classified as SkipCommentOnly and silently skipped from review. For example, changes to README, CHANGELOG, or API documentation that only modify list items would never be reviewed.
Prompt for agents
In src/review/triage.rs, the is_comment_line function (line 176) should either: (1) accept the file extension as a parameter so it can apply language-specific comment prefix lists, or (2) the triage_diff function should exclude known non-code file extensions (.md, .txt, .rst, .adoc) from the SkipCommentOnly classification. The COMMENT_PREFIXES list at line 44 is correct for code files but causes false positives for Markdown/documentation files where '* ' is a list item, '# ' is a heading, and '--' is a horizontal rule.
Was this helpful? React with 👍 or 👎 to provide feedback.
| if [ -z "${{ secrets.OPENAI_API_KEY }}" ]; then | ||
| echo "skip=true" >> "$GITHUB_OUTPUT" | ||
| echo "::notice::DiffScope review skipped — OPENAI_API_KEY secret not configured" | ||
| else | ||
| echo "skip=false" >> "$GITHUB_OUTPUT" | ||
| fi |
There was a problem hiding this comment.
🔴 GitHub Actions secret directly interpolated into shell command allows injection
At line 33, ${{ secrets.OPENAI_API_KEY }} is expanded by GitHub Actions' template engine before the shell runs, meaning the raw secret value is pasted directly into the if [ -z "..." ] command. If the secret contains shell-special characters (e.g., ", `, $(...), ;, |), it can break the script or, worse, execute arbitrary commands.
Safe pattern using environment variable
The standard fix is to pass the secret through an environment variable:
- name: Check API key
id: check_key
env:
KEY: ${{ secrets.OPENAI_API_KEY }}
run: |
if [ -z "$KEY" ]; then
echo "skip=true" >> "$GITHUB_OUTPUT"
else
echo "skip=false" >> "$GITHUB_OUTPUT"
fiThis way the value is never interpolated into the script text itself.
| if [ -z "${{ secrets.OPENAI_API_KEY }}" ]; then | |
| echo "skip=true" >> "$GITHUB_OUTPUT" | |
| echo "::notice::DiffScope review skipped — OPENAI_API_KEY secret not configured" | |
| else | |
| echo "skip=false" >> "$GITHUB_OUTPUT" | |
| fi | |
| if [ -z "$OPENAI_KEY" ]; then | |
| echo "skip=true" >> "$GITHUB_OUTPUT" | |
| echo "::notice::DiffScope review skipped — OPENAI_API_KEY secret not configured" | |
| else | |
| echo "skip=false" >> "$GITHUB_OUTPUT" | |
| fi |
Was this helpful? React with 👍 or 👎 to provide feedback.
| pub fn find_enclosing_boundary_line( | ||
| content: &str, | ||
| file_path: &Path, | ||
| line: usize, | ||
| _max_search_lines: usize, | ||
| ) -> Option<usize> { | ||
| let language = detect_language(file_path); | ||
| let boundaries = detect_function_boundaries(content, &language); | ||
| // Find the innermost function containing this line | ||
| find_enclosing_function(&boundaries, line).map(|b| b.start_line) | ||
| } |
There was a problem hiding this comment.
🟡 find_enclosing_boundary_line ignores max_search_lines parameter, violating its documented contract
The function's docstring at src/core/function_chunker.rs:285-287 states it "Searches upward from line up to max_search_lines", but the implementation at line 292 prefixes the parameter with _ and completely ignores it. Instead it scans all function boundaries in the entire file. The caller at src/core/context.rs:56-57 passes 10 and then applies its own post-hoc filter (boundary >= start.saturating_sub(10)) to compensate, but if anyone else calls find_enclosing_boundary_line trusting the documented parameter semantics, the limit won't be respected.
Prompt for agents
In src/core/function_chunker.rs at line 292, the _max_search_lines parameter is unused. Either implement the documented behavior (filter boundaries to only those within max_search_lines above the given line), or remove the parameter from the function signature and update the docstring. If you keep it, the implementation should filter like: find_enclosing_function(&boundaries, line).filter(|b| line - b.start_line <= max_search_lines).map(|b| b.start_line).
Was this helpful? React with 👍 or 👎 to provide feedback.
Summary
ModelRoleenum (Primary/Weak/Reasoning/Embedding) with per-role model config and fallback chain10 files changed, +2485 lines, 114 new tests (716 total, all passing).
Test plan
cargo test)🤖 Generated with Claude Code